Abstract

We consider the problem of information embedding where the encoder modifies a white Gaussian host signal 
in a power-constrained manner to encode the message, and the decoder recovers both the embedded message 
and the modified host signal. This extends the recent work of Sumszyk and Steinberg to the continuous-alphabet 
Gaussian setting. We show that a dirty-paper-coding based strategy achieves the optimal rate for perfect recovery 
of the modified host and the message. We also provide bounds for the extension wherein the modified host signal 
is recovered only to within a specified distortion. When specialized to the zero-rate case, our results provide the 
tightest known lower bounds on the asymptotic costs for the vector version of a famous open problem in distributed 
control — the Witsenhausen counterexample. Using this bound, we characterize the asymptotically optimal costs 
for the vector Witsenhausen problem numerically to within a factor of 1.3 for all problem parameters, improving 
on the earlier best known bound of 2. 

I. Introduction 

The problem of interest in this paper (see Fig. 1) derives its motivation from an information-theoretic
standpoint, as well as from a distributed-control perspective. Information-theoretically, the problem is an
extension of an information embedding problem recently addressed by Sumszyk and Steinberg [1]: the
encoder ensures that the decoder recovers the modified host signal $X^m$ perfectly, along with the message.
Philosophically, the work in [1] is directed towards understanding how a communication problem changes
when an additional requirement, that of the encoder being able to produce a copy of the reconstruction
at the decoder, is imposed on the system (in the source coding context, the issue was explored by Steinberg
in [2]). The problem is also closely connected to other information theory problems [3]-[6]. We refer the
interested reader to [7], where these connections are discussed in detail.
Fig. 1. The host signal $S^m$ is first modified by the encoder using a power-constrained input $U^m$. The modified host signal $X^m$ and the
message $M$ are then reconstructed at the decoder. The problem is to find the minimum distortion in the reconstruction of $X^m$ given $P$, the
power constraint, and $R$, the rate of reliable message transmission.

In [1], the authors assume that the host signal $S^m$, the modified host signal (the channel input) $X^m$ and
the channel output $Y^m$ are all finite-alphabet. In this paper, we consider the Gaussian version of their
problem. The extension is non-trivial [8] because simple Fano's inequality-based techniques do not work
for the infinite-alphabet formulation. Experience in infinite-alphabet problems might even suggest that
(asymptotic) perfect reconstruction may be impossible because the problem is set in continuous space. 
Intriguingly, asymptotic perfect reconstruction is possible in our problem because the encoder can ensure 
that the modified host signal takes values in a discrete subset of the continuous space. We provide tight 
results characterizing the tradeoff between rate and power for perfect reconstruction. As is more natural in 
a continuous-alphabet setting, we relax the assumption of perfect recovery of the host signal by considering 
recovery within a specified nonzero distortion, and for this problem we provide upper and lower bounds 
on the tradeoff between rate, power and average distortion. 

The nonzero distortion problem is closely related to the vector version of a famous distributed control 
problem called the Witsenhausen counterexample [9]: at zero communication rate, the two problems
are the same [7]. The scalar counterexample is believed to be quite challenging (see for a survey of 
prior results showing why it is believed to be so). As a conceptual simplification, Grover and Sahai [7]
considered the long-blocklength limit of the counterexample. Further, they relaxed the requirement of
obtaining a provably optimal strategy to the weaker objective of obtaining strategies that attain within 
a constant factor of the optimal cost. For the weighted sum of power and average distortion costs (see 
Section II), they then show that dirty-paper coding techniques attain within a factor of 2 of the optimal cost
for all problem parameters (i.e. the weights and the variances of the random variables). Backing off from
the infinite-blocklength limit, Grover, Park and Sahai [10] then showed that similar constant-factor results
can also be obtained for finite vector lengths, including the scalar case. The achievable strategy, which
yields the upper bounds, now uses lattices instead of random codebooks. The lower bound is obtained by
applying sphere-packing ideas from information theory to the bound of [7].

The lower bound in this paper specialized to rate zero provides an improved lower bound to the costs 
of the vector Witsenhausen counterexample in the long-blocklength limit. Using this improved bound, we 
show that the ratio of upper and lower bounds is smaller than 1.3 regardless of the choice of the weights 
and the problem parameters. This is an improvement over the previously best known maximum ratio of 
two [7].

Control theory has long wrestled with the Witsenhausen counterexample. Because it is a canonical 
problem, a comprehensive distributed-control theory would necessarily include a good understanding of 
the counterexample. Information theory has had long-standing canonical problems of its own. In a line of
investigation started by Gupta and Kumar [11], the question of the capacity of a large wireless network
is studied. By restricting attention to obtaining just the scaling of the total capacity, the bar for what
might constitute a reasonable information-theoretic solution was lowered. More recently, the calculation
of channel capacity to within a finite number of bits^1 for canonical information-theory problems (e.g. the
interference channel [12]) has led to significant advances in understanding capacity for larger network
communication problems [13], [14]. The recent results on Witsenhausen's counterexample thus raise a
parallel hope in distributed control.

1 Our constant-factor results on control costs are closely related to results on bounded gap from capacity. A factor of 2 approximation in
power would be a slightly stronger result than a 1/2-bit approximation in the capacity of a real channel.

II. Problem Statement 

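The host signal $S^m$ is distributed $\mathcal{N}(0, \sigma^2 I)$, and the message $M$ is independent of $S^m$ and distributed
uniformly over $\{1, 2, \ldots, 2^{mR}\}$. The encoder $\mathcal{E}_m$ maps $(M, S^m)$ to $X^m$ by additively distorting $S^m$ using
an input $U^m$ of average power (for each message) at most $P$, i.e. $E[\|S^m - X^m\|^2] \le mP$. Additive white
Gaussian noise $Z^m \sim \mathcal{N}(0, \sigma_z^2 I)$, where $\sigma_z^2 = 1$, is added to $X^m$ by the channel. The decoder $\mathcal{D}_m$ maps
the channel outputs $Y^m$ to both an estimate $\hat{X}^m$ of the modified host signal $X^m$ and an estimate $\hat{M}$ of
the message.

Define the error probability $\epsilon_m(\mathcal{E}_m, \mathcal{D}_m) = \Pr(M \ne \hat{M})$. For the encoder-decoder sequence $\{\mathcal{E}_m, \mathcal{D}_m\}_{m=1}^{\infty}$,
define the minimum asymptotic distortion $\mathrm{MMSE}(P, R)$ as follows:

$$ \mathrm{MMSE}(P, R) = \inf \limsup_{m \to \infty} \frac{1}{m} E\left[ \|X^m - \hat{X}^m\|^2 \right], $$

where the infimum is over encoder-decoder sequences that satisfy the power constraint and have $\epsilon_m \to 0$.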

We are interested in the tradeoff between the rate R, the power P, and MMSE(P, R). 

The conventional control-theoretic weighted cost formulation [9] defines the total cost to be

$$ J = \frac{1}{m} k^2 \|U^m\|^2 + \frac{1}{m} \|X^m - \hat{X}^m\|^2, \tag{1} $$

where $k \in \mathbb{R}^+$. The objective is to minimize the average cost $E[J]$ at rate $R$. The average is taken over
the realizations of the host signal, the channel noise, and the message. At $R = 0$, the problem is the
vector Witsenhausen counterexample [7].
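To make the cost in (1) concrete, the following minimal sketch estimates $E[J]$ by Monte Carlo for one simple strategy at $R = 0$. The strategy (zero input, $U^m = 0$, with a linear MMSE decoder) and the function name `zero_input_cost` are illustrative assumptions, not part of the paper's constructions. For this strategy the average cost should be close to $\sigma^2/(\sigma^2 + 1)$.

```python
import random

def zero_input_cost(sigma, k, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[J] in (1) for the zero-input strategy U = 0 at R = 0.

    Hypothetical helper for illustration only; the paper does not define it.
    """
    rng = random.Random(seed)
    gain = sigma**2 / (sigma**2 + 1.0)  # linear MMSE gain for X given Y = X + Z
    power = 0.0                         # U = 0 spends no power
    sq_err = 0.0
    for _ in range(n_samples):
        s = rng.gauss(0.0, sigma)       # host sample with variance sigma^2
        z = rng.gauss(0.0, 1.0)         # unit-variance channel noise
        x = s                           # X = S + U with U = 0
        x_hat = gain * (x + z)          # decoder's estimate from Y = X + Z
        sq_err += (x - x_hat) ** 2
    return k**2 * power + sq_err / n_samples

if __name__ == "__main__":
    # Should be close to sigma^2 / (sigma^2 + 1) = 0.5 for sigma = 1.
    print(zero_input_cost(sigma=1.0, k=0.5))
```

Because the samples are iid across coordinates, averaging over `n_samples` scalar draws plays the role of the per-coordinate normalization $1/m$ in (1).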

III. Main Results 

A. Lower bounds on MMSE(P, R) 

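Theorem 1: For the problem as stated in Section II, for communicating reliably at rate $R$ with input
power $P$, the asymptotic average mean-square error in recovering $X^m$ is lower bounded as follows. For
$P \ge 2^{2R} - 1$,

$$ \mathrm{MMSE}(P, R) \ge \inf_{\sigma_{SU}} \sup_{\gamma > 0} \frac{1}{\gamma^2} \left( \left( \sqrt{\frac{\sigma^2 2^{2R}}{1 + \sigma^2 + P + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2 \sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right)^+ \right)^2, \tag{2} $$

where $\max\left\{ -\sigma\sqrt{P}, \frac{2^{2R} - 1 - P - \sigma^2}{2} \right\} \le \sigma_{SU} \le \sigma\sqrt{P}$ and $(x)^+ = \max\{x, 0\}$. For $P < 2^{2R} - 1$, reliable communication at rate $R$
is not possible.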

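Corollary 1: For the vector Witsenhausen problem with $E[\|U^m\|^2] \le mP$, the following is a lower
bound on the MMSE in the estimation of $X^m$:

$$ \mathrm{MMSE}(P, 0) \ge \inf_{\sigma_{SU}} \sup_{\gamma > 0} \frac{1}{\gamma^2} \left( \left( \sqrt{\frac{\sigma^2}{1 + \sigma^2 + P + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2 \sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right)^+ \right)^2, \tag{3} $$

where $\sigma_{SU} \in [-\sigma\sqrt{P}, \sigma\sqrt{P}]$.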
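The bound in Corollary 1 has no closed form, so it is natural to evaluate it numerically. The sketch below is one possible way to do so with a plain grid search; it is not the MATLAB code of [15]. The function name, grid sizes and the cap `gamma_max` are assumptions made for illustration. Restricting $\gamma$ to a grid can only under-estimate the inner supremum, whereas restricting $\sigma_{SU}$ to a grid can over-estimate the outer infimum, so the returned value is an approximation of the bound rather than a certified bound.

```python
import math

def corollary1_lower_bound(P, sigma, n_su=201, n_gamma=400, gamma_max=4.0):
    """Grid-search approximation of the inf-sup lower bound on MMSE(P, 0).

    Illustrative helper; grid sizes and gamma_max are arbitrary choices.
    """
    s2 = sigma**2
    limit = sigma * math.sqrt(P)              # |sigma_SU| <= sigma * sqrt(P)
    outer = float("inf")
    for i in range(n_su):
        s_su = -limit + 2.0 * limit * i / (n_su - 1)
        first = math.sqrt(s2 / (1.0 + s2 + P + 2.0 * s_su))
        inner = 0.0
        for j in range(1, n_gamma + 1):
            g = gamma_max * j / n_gamma       # gamma > 0 on a uniform grid
            second_sq = (1 - g) ** 2 * s2 + g * g * P - 2 * g * (1 - g) * s_su
            second = math.sqrt(max(second_sq, 0.0))   # guard against rounding
            inner = max(inner, (max(first - second, 0.0) / g) ** 2)
        outer = min(outer, inner)
    return outer

if __name__ == "__main__":
    for P in (0.0, 0.25, 0.5, 1.0):
        print(P, corollary1_lower_bound(P, sigma=1.0))
```

Two sanity checks follow directly from the formula: at $P = 0$ the bound equals $\sigma^2/(\sigma^2+1)$ (attained at $\gamma = 1$), and at $P = \sigma^2$ the choice $\sigma_{SU} = -\sigma\sqrt{P}$ drives the bound to zero.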

Proof (of Theorem 1): For conceptual clarity, we first derive the result for the case $R = 0$ (Corollary 1).
The tools developed are then used to derive the lower bound for $R > 0$.

Proof (of Corollary 1):

For any chosen pair of encoding map $\mathcal{E}_m$ and decoding map $\mathcal{D}_m$, there is a Markov chain $S^m \to X^m \to
Y^m \to \hat{X}^m$. Using the data-processing inequality,

$$ I(S^m; \hat{X}^m) \le I(X^m; Y^m). \tag{4} $$

The terms in the inequality can be bounded by single-letter expressions as follows. Define $Q$ as a random
variable uniformly distributed over $\{1, 2, \ldots, m\}$. Define $S = S_Q$, $U = U_Q$, $X = X_Q$, $Z = Z_Q$, $Y = Y_Q$
and $\hat{X} = \hat{X}_Q$. Then,

$$
\begin{aligned}
I(X^m; Y^m) &= h(Y^m) - h(Y^m | X^m) \\
&\overset{(a)}{\le} \sum_i h(Y_i) - h(Y^m | X^m) \\
&\overset{(b)}{=} \sum_i h(Y_i) - \sum_i h(Y_i | X_i) \\
&= \sum_i I(X_i; Y_i) \\
&= m I(X; Y | Q) \\
&= m \left( h(Y|Q) - h(Y|X,Q) \right) \\
&\overset{(a)}{\le} m \left( h(Y) - h(Y|X,Q) \right) \\
&\overset{(b)}{=} m \left( h(Y) - h(Y|X) \right) = m I(X;Y),
\end{aligned} \tag{5}
$$

where (a) follows from an application of the chain rule for entropy followed by using the fact that
conditioning reduces entropy, and (b) follows from the observation that the additive noise $Z_i$ is iid across
time and independent of the input $X_i$ (thus $Y \perp Q \mid X$). Also,

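$$
\begin{aligned}
I(S^m; \hat{X}^m) &= h(S^m) - h(S^m | \hat{X}^m) \\
&= \sum_i h(S_i) - h(S^m | \hat{X}^m) \\
&\overset{(a)}{\ge} \sum_i h(S_i) - \sum_i h(S_i | \hat{X}_i) \\
&= \sum_i I(S_i; \hat{X}_i) = m I(S; \hat{X} | Q) \\
&= m \left( h(S|Q) - h(S|\hat{X},Q) \right) \\
&\overset{(b)}{\ge} m \left( h(S) - h(S|\hat{X}) \right) = m I(S; \hat{X}),
\end{aligned} \tag{6}
$$

where (a) and (b) again follow from the fact that conditioning reduces entropy, and (b) also uses the
observation that since the $S_i$ are iid, $S$, $S_i$, and $S | Q = q$ are distributed identically.
Now, using (4), (5) and (6),

$$ m I(S; \hat{X}) \le I(S^m; \hat{X}^m) \le I(X^m; Y^m) \le m I(X; Y). \tag{7} $$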



Also observe that from the definitions of $S$, $X$, $\hat{X}$ and $Y$, $E[d(S^m, X^m)] = E[d(S, X)]$ and $E[d(X^m, \hat{X}^m)] =
E[d(X, \hat{X})]$.

Using the Cauchy-Schwarz inequality, the correlation $\sigma_{SU} = E[SU]$ must satisfy the following constraint:

$$ |\sigma_{SU}| = |E[SU]| \le \sqrt{E[S^2]} \sqrt{E[U^2]} \le \sigma\sqrt{P}. \tag{8} $$

Also,

$$ E[X^2] = E[(S + U)^2] = \sigma^2 + P + 2\sigma_{SU}. \tag{9} $$

Since $Z = Y - X$ is independent of $X$, and a Gaussian input distribution maximizes the mutual information across an
average-power-constrained AWGN channel,

$$ I(X; Y) \le \frac{1}{2} \log_2 \left( 1 + P + \sigma^2 + 2\sigma_{SU} \right). \tag{10} $$
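As a small numerical illustration of (8)-(10), the sketch below computes the admissible range of the correlation $\sigma_{SU}$ and the resulting upper bound on $I(X;Y)$. The function names are hypothetical. The linear input $U = -\beta S$ is used only as an example of an input whose correlation $\sigma_{SU} = -\beta\sigma^2$ reaches the left end of the range when $\beta^2\sigma^2 = P$.

```python
import math

def correlation_range(P, sigma):
    """Interval allowed for sigma_SU by the Cauchy-Schwarz constraint (8)."""
    limit = sigma * math.sqrt(P)
    return -limit, limit

def mutual_info_upper_bound(P, sigma, s_su):
    """Right-hand side of (10) in bits, for unit noise variance."""
    lo, hi = correlation_range(P, sigma)
    if not lo <= s_su <= hi:
        raise ValueError("sigma_SU violates the Cauchy-Schwarz constraint (8)")
    return 0.5 * math.log2(1.0 + P + sigma**2 + 2.0 * s_su)

def linear_input_correlation(beta, sigma):
    """sigma_SU = E[SU] for the illustrative linear input U = -beta * S."""
    return -beta * sigma**2

if __name__ == "__main__":
    sigma, P = 2.0, 4.0
    beta = math.sqrt(P) / sigma      # uses all the power: beta^2 sigma^2 = P
    s_su = linear_input_correlation(beta, sigma)
    print(correlation_range(P, sigma), s_su, mutual_info_upper_bound(P, sigma, s_su))
```

With $\sigma = 2$ and $P = 4$ the linear input cancels the host entirely, so $E[X^2] = 0$ by (9) and the bound (10) evaluates to zero bits.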

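Also,

$$
\begin{aligned}
I(S; \hat{X}) &= h(S) - h(S | \hat{X}) \\
&= h(S) - h(S - \gamma\hat{X} | \hat{X}) \\
&\overset{(a)}{\ge} h(S) - h(S - \gamma\hat{X}) \\
&= \frac{1}{2} \log_2 (2\pi e \sigma^2) - h(S - \gamma\hat{X}),
\end{aligned} \tag{11}
$$

where (a) follows from the fact that conditioning reduces entropy. Also note here that the result holds
for any $\gamma > 0$, and in particular, $\gamma$ can depend on $\sigma_{SU}$. Now,

$$
\begin{aligned}
h(S - \gamma\hat{X}) &= h(S - \gamma(\hat{X} - X) - \gamma X) \\
&= h(S - \gamma(\hat{X} - X) - \gamma S - \gamma U) \\
&= h((1-\gamma)S - \gamma U - \gamma(\hat{X} - X)).
\end{aligned} \tag{12}
$$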
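The second moment of a sum of two random variables $A$ and $B$ can be bounded as follows:

$$
\begin{aligned}
E[(A + B)^2] &= E[A^2] + E[B^2] + 2E[AB] \\
&\le E[A^2] + E[B^2] + 2\sqrt{E[A^2]}\sqrt{E[B^2]} \quad \text{(Cauchy-Schwarz inequality)} \\
&= \left( \sqrt{E[A^2]} + \sqrt{E[B^2]} \right)^2,
\end{aligned} \tag{13}
$$

with equality when $A$ and $B$ are aligned, i.e. $A = \lambda B$ for some $\lambda \in \mathbb{R}$. For the random variable under
consideration in (12), choosing $A = (1-\gamma)S - \gamma U$ and $B = -\gamma(\hat{X} - X)$ in (13),

$$ \sqrt{E\left[ \left( (1-\gamma)S - \gamma U - \gamma(\hat{X} - X) \right)^2 \right]} \le \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} + \gamma\sqrt{E[(X - \hat{X})^2]}. \tag{14} $$

Equality is obtained by aligning^2 $X - \hat{X}$ with $(1-\gamma)S - \gamma U$. Thus, since a Gaussian maximizes differential entropy for a given second moment,

$$
\begin{aligned}
I(S; \hat{X}) &\ge \frac{1}{2}\log_2(2\pi e \sigma^2) - h(S - \gamma\hat{X}) \\
&\ge \frac{1}{2}\log_2 \left( \frac{\sigma^2}{\left( \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} + \gamma\sqrt{E[(X - \hat{X})^2]} \right)^2} \right).
\end{aligned} \tag{15}
$$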
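Using (7), $I(S; \hat{X}) \le I(X; Y)$. Using the lower bound on $I(S; \hat{X})$ from (15) and the upper bound on
$I(X; Y)$ from (10), we get

$$ \frac{1}{2}\log_2 \left( \frac{\sigma^2}{\left( \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} + \gamma\sqrt{E[(X - \hat{X})^2]} \right)^2} \right) \le \frac{1}{2}\log_2 \left( 1 + P + \sigma^2 + 2\sigma_{SU} \right) $$

for the choice of $\mathcal{E}_m$ and $\mathcal{D}_m$. Since $\log_2(\cdot)$ is a monotonically increasing function,

$$ \frac{\sigma^2}{\left( \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} + \gamma\sqrt{E[(X - \hat{X})^2]} \right)^2} \le 1 + P + \sigma^2 + 2\sigma_{SU}, $$

i.e.

$$ \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} + \gamma\sqrt{E[(X - \hat{X})^2]} \ge \frac{\sigma}{\sqrt{1 + P + \sigma^2 + 2\sigma_{SU}}}. $$

Since $\gamma > 0$,

$$ \sqrt{E[(X - \hat{X})^2]} \ge \frac{1}{\gamma} \left( \frac{\sigma}{\sqrt{1 + P + \sigma^2 + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right). $$

Because the RHS may not be positive, we take the maximum of zero and the RHS and obtain the following
lower bound for $\mathcal{E}_m$ and $\mathcal{D}_m$:

$$ E[(X - \hat{X})^2] \ge \frac{1}{\gamma^2} \left( \left( \sqrt{\frac{\sigma^2}{1 + P + \sigma^2 + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right)^+ \right)^2. \tag{16} $$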



2 In general, since $\hat{X}^m$ is a function of $Y^m$, this alignment is not actually possible when the recovery of $X^m$ is not exact. The derived
bound is therefore loose.



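Because the bound holds for every $\gamma > 0$,

$$ E[(X - \hat{X})^2] \ge \sup_{\gamma > 0} \frac{1}{\gamma^2} \left( \left( \sqrt{\frac{\sigma^2}{1 + P + \sigma^2 + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right)^+ \right)^2 \tag{17} $$

for the chosen $\mathcal{E}_m$ and $\mathcal{D}_m$. Now, from (8), $\sigma_{SU}$ can take values in $[-\sigma\sqrt{P}, \sigma\sqrt{P}]$. Because the lower
bound depends on $\mathcal{E}_m$ and $\mathcal{D}_m$ only through $\sigma_{SU}$, we obtain the following lower bound for all $\mathcal{E}_m$ and
$\mathcal{D}_m$:

$$ E[(X - \hat{X})^2] \ge \inf_{|\sigma_{SU}| \le \sigma\sqrt{P}} \sup_{\gamma > 0} \frac{1}{\gamma^2} \left( \left( \sqrt{\frac{\sigma^2}{1 + P + \sigma^2 + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right)^+ \right)^2, \tag{18} $$

which proves Corollary 1. Notice that we did not take limits in $m$ anywhere, and hence the lower bound
holds for all values of $m$. ■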



The case of nonzero rate 

To prove Theorem 1, consider now the problem when the encoder also wants to communicate a message
$M$ reliably to the decoder at rate $R$.

Using Fano's inequality, since $\Pr(M \ne \hat{M}) = \epsilon_m \to 0$ as $m \to \infty$, $H(M | \hat{M}) \le m\delta_m$ where $\delta_m \to 0$.
Thus,

$$
\begin{aligned}
I(M; \hat{M}) &= H(M) - H(M | \hat{M}) \\
&= mR - H(M | \hat{M}) \\
&\ge mR - m\delta_m = m(R - \delta_m).
\end{aligned} \tag{19}
$$

As before, we consider a mutual information inequality that follows directly from the Markov chain
$(M, S^m) \to X^m \to Y^m \to (\hat{X}^m, \hat{M})$:

$$ I(M, S^m; \hat{M}, \hat{X}^m) \le I(X^m; Y^m). \tag{20} $$
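The RHS can be bounded above as in (5). For the LHS,

$$
\begin{aligned}
I(M, S^m; \hat{M}, \hat{X}^m) &= I(M; \hat{M}, \hat{X}^m) + I(S^m; \hat{M}, \hat{X}^m | M) \\
&\ge I(M; \hat{M}) + I(S^m; \hat{M}, \hat{X}^m | M) \\
&= I(M; \hat{M}) + h(S^m | M) - h(S^m | M, \hat{X}^m, \hat{M}) \\
&= I(M; \hat{M}) + h(S^m) - h(S^m | M, \hat{X}^m, \hat{M}) \quad (S^m \perp M) \\
&\ge I(M; \hat{M}) + h(S^m) - h(S^m | \hat{X}^m) \\
&= I(M; \hat{M}) + I(S^m; \hat{X}^m) \\
&\ge I(M; \hat{M}) + m I(S; \hat{X}) \quad \text{(using (6))}.
\end{aligned} \tag{21}
$$

From (19), (20) and (21), we obtain

$$
\begin{aligned}
m(R - \delta_m) + m I(S; \hat{X}) &\le I(M; \hat{M}) + m I(S; \hat{X}) \quad \text{(using (19))} \\
&\le I(M, S^m; \hat{M}, \hat{X}^m) \quad \text{(using (21))} \\
&\le I(X^m; Y^m) \quad \text{(using (20))} \\
&\le m I(X; Y) \quad \text{(using (5))}.
\end{aligned} \tag{22}
$$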



$I(X; Y)$ and $I(S; \hat{X})$ can be bounded as before in (10) and (15). Observing that as $m \to \infty$, $\delta_m \to 0$,
we get the following lower bound on the MMSE for nonzero rate:

$$ \mathrm{MMSE}(P, R) \ge \inf_{\sigma_{SU}} \sup_{\gamma > 0} \frac{1}{\gamma^2} \left( \left( \sqrt{\frac{\sigma^2 2^{2R}}{1 + \sigma^2 + P + 2\sigma_{SU}}} - \sqrt{(1-\gamma)^2\sigma^2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma_{SU}} \right)^+ \right)^2. \tag{23} $$

In the limit $\delta_m \to 0$, we require from (22) that $I(X; Y) \ge R$. This gives the following constraint on $\sigma_{SU}$:

$$ \frac{1}{2}\log_2 \left( 1 + P + \sigma^2 + 2\sigma_{SU} \right) \ge R, \quad \text{i.e.} \quad \sigma_{SU} \ge \frac{2^{2R} - 1 - P - \sigma^2}{2}, \tag{24} $$

yielding (in conjunction with (8)) the constraint on $\sigma_{SU}$ in Theorem 1. The constraint on $P$ in the
Theorem follows from Costa's result [3], because the rate $R$ must be smaller than the capacity over a
power-constrained AWGN channel with known interference, $\frac{1}{2}\log_2(1 + P)$. ■
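The rate-dependent bound of Theorem 1 can be approximated numerically in the same way as the zero-rate bound. The following is a minimal sketch under assumed names and grid sizes (none of them come from the paper or from [15]): `sigma_su_interval` implements the feasible range of $\sigma_{SU}$ from (8) and (24), and `theorem1_lower_bound` grid-searches the inf-sup in (23). As with any grid search, the inner supremum is under-estimated and the outer infimum may be over-estimated.

```python
import math

def sigma_su_interval(P, R, sigma):
    """Feasible range of sigma_SU from (8) and (24); None if P < 2^(2R) - 1."""
    if P < 2.0 ** (2.0 * R) - 1.0:
        return None                       # rate R is not achievable at power P
    limit = sigma * math.sqrt(P)
    lo = max(-limit, (2.0 ** (2.0 * R) - 1.0 - P - sigma**2) / 2.0)
    return lo, limit

def theorem1_lower_bound(P, R, sigma, n_su=101, n_gamma=400, gamma_max=4.0):
    """Grid-search approximation of the lower bound (23) on MMSE(P, R)."""
    interval = sigma_su_interval(P, R, sigma)
    if interval is None:
        return None
    lo, hi = interval
    s2 = sigma**2
    outer = float("inf")
    for i in range(n_su):
        s_su = lo + (hi - lo) * i / (n_su - 1)
        first = math.sqrt(s2 * 2.0 ** (2.0 * R) / (1.0 + s2 + P + 2.0 * s_su))
        inner = 0.0
        for j in range(1, n_gamma + 1):
            g = gamma_max * j / n_gamma
            second_sq = (1 - g) ** 2 * s2 + g * g * P - 2 * g * (1 - g) * s_su
            second = math.sqrt(max(second_sq, 0.0))   # guard against rounding
            inner = max(inner, (max(first - second, 0.0) / g) ** 2)
        outer = min(outer, inner)
    return outer

if __name__ == "__main__":
    print(sigma_su_interval(1.0, 0.5, 1.0))
    print(theorem1_lower_bound(1.0, 0.5, 1.0))
```

For $R = 0$ the interval reduces to $[-\sigma\sqrt{P}, \sigma\sqrt{P}]$, so the same routine also covers Corollary 1.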
It is insightful to see how the lower bound in Corollary 1 is an improvement over that in [7]. The lower
bound in [7] is given by

$$ \mathrm{MMSE}(P, 0) \ge \left( \left( \sqrt{\frac{\sigma^2}{\sigma^2 + P + 2\sigma\sqrt{P} + 1}} - \sqrt{P} \right)^+ \right)^2, \tag{25} $$

which again holds for all $m$. Because any $\gamma$ provides a valid lower bound in Corollary 1, choosing $\gamma = 1$
in Corollary 1 provides the following (loosened) bound,

$$ \mathrm{MMSE}(P, 0) \ge \inf_{|\sigma_{SU}| \le \sigma\sqrt{P}} \left( \left( \sqrt{\frac{\sigma^2}{\sigma^2 + P + 2\sigma_{SU} + 1}} - \sqrt{P} \right)^+ \right)^2, \tag{26} $$

which is minimized for $\sigma_{SU} = \sigma\sqrt{P}$. This immediately yields the lower bound (25) of [7].
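The relation between the loosened $\gamma = 1$ bound and the earlier bound of [7] can be checked numerically. The sketch below, with hypothetical function names and an arbitrary grid size, evaluates the closed-form bound of [7] and the infimum over $\sigma_{SU}$ of the $\gamma = 1$ bound on a grid that includes both endpoints.

```python
import math

def old_lower_bound(P, sigma):
    """Closed-form bound (25): the gamma = 1 bound at sigma_SU = sigma * sqrt(P)."""
    first = math.sqrt(sigma**2 / (sigma**2 + P + 2.0 * sigma * math.sqrt(P) + 1.0))
    return max(first - math.sqrt(P), 0.0) ** 2

def gamma_one_bound(P, sigma, n_su=401):
    """Bound (26): gamma fixed to 1, infimum over a grid of sigma_SU values."""
    limit = sigma * math.sqrt(P)
    best = float("inf")
    for i in range(n_su):
        s_su = -limit + 2.0 * limit * i / (n_su - 1)
        first = math.sqrt(sigma**2 / (sigma**2 + P + 2.0 * s_su + 1.0))
        best = min(best, max(first - math.sqrt(P), 0.0) ** 2)
    return best

if __name__ == "__main__":
    for P in (0.01, 0.1, 0.3):
        print(P, old_lower_bound(P, 1.0), gamma_one_bound(P, 1.0))
```

Since the first square root decreases as $\sigma_{SU}$ grows, the grid minimum sits at the right endpoint and the two printed values should agree up to rounding.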



B. The upper bound and the tightness at MMSE = 0

We use the combination of linear and dirty-paper coding strategies of [7], except that we communicate
a message at rate $R$ as well. We summarize the strategy briefly, and refer the interested reader to [7] for
a detailed description and analysis of the achievability.

The encoder divides its input into two parts $U^m_{lin}$ and $U^m_{dpc}$ of powers $P_{lin}$ and $P_{dpc}$ respectively, such
that $P = P_{lin} + P_{dpc}$ (by construction, $U^m_{lin}$ and $U^m_{dpc}$ turn out to be orthogonal in the limit). We refer to
$P_{lin}$ as the linear part of the power, and $P_{dpc}$ as the dirty-paper coding part of the power. The linear part
is used to scale the host signal down by a factor $\beta$ (using $U^m_{lin} = -\beta S^m$) so that the scaled-down host
signal has variance $\tilde{\sigma}^2 = \sigma^2(1-\beta)^2$, where $\beta^2\sigma^2 = P_{lin}$. Using the remaining $P_{dpc}$ power, the transmitter
dirty-paper codes against the scaled-down host signal $(1-\beta)S^m$ with the DPC parameter $\alpha$ [3] allowed
to be arbitrary (unlike in [7], where it is eventually chosen to be the MMSE parameter).

A plain DPC strategy achieves the following rate [3, Eq. (6)]:

$$ R = \frac{1}{2}\log_2 \left( \frac{P(P + \sigma^2 + 1)}{P\sigma^2(1-\alpha)^2 + P + \alpha^2\sigma^2} \right). \tag{27} $$

The strategy recovers $U^m + \alpha S^m$ at the decoder with high probability. Because we also have a linear part
here, the achieved rate is

$$ R = \frac{1}{2}\log_2 \left( \frac{P_{dpc}(P_{dpc} + \tilde{\sigma}^2 + 1)}{P_{dpc}\tilde{\sigma}^2(1-\alpha)^2 + P_{dpc} + \alpha^2\tilde{\sigma}^2} \right). \tag{28} $$

The decoder now decodes the codeword $U^m_{dpc} + \alpha(1-\beta)S^m$. It then performs an MMSE estimation for
estimating $X^m = S^m + U^m = (1-\beta)S^m + U^m_{dpc}$ using the channel output $Y^m = (1-\beta)S^m + U^m_{dpc} + Z^m$
and the decoded codeword $\alpha(1-\beta)S^m + U^m_{dpc}$. The obtained MMSE can now be minimized over the
choice of $\alpha$ and $\beta$ under the constraint (28).
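The rate expressions for plain DPC and for the combined linear and DPC strategy are easy to evaluate directly. The sketch below is an illustration with assumed function and parameter names; it takes unit noise variance as in Section II and requires $P_{dpc} > 0$. A useful check is that the choice $\alpha = P/(P+1)$, the MMSE parameter of [3], recovers the interference-free capacity $\frac{1}{2}\log_2(1+P)$.

```python
import math

def dpc_rate(P_dpc, host_var, alpha):
    """DPC rate in bits for power P_dpc against a host of variance host_var.

    Unit noise variance is assumed; P_dpc must be positive.
    """
    num = P_dpc * (P_dpc + host_var + 1.0)
    den = P_dpc * host_var * (1.0 - alpha) ** 2 + P_dpc + alpha**2 * host_var
    return 0.5 * math.log2(num / den)

def linear_dpc_rate(P, P_lin, sigma, alpha):
    """Rate of the combined strategy: scale the host with P_lin, DPC with the rest."""
    beta = math.sqrt(P_lin) / sigma              # from beta^2 sigma^2 = P_lin
    scaled_var = sigma**2 * (1.0 - beta) ** 2    # variance of (1 - beta) S
    return dpc_rate(P - P_lin, scaled_var, alpha)

if __name__ == "__main__":
    P, sigma = 2.0, 1.5
    print(dpc_rate(P, sigma**2, P / (P + 1.0)), 0.5 * math.log2(1.0 + P))
    print(linear_dpc_rate(P, 0.5, sigma, 1.0))
```

Setting `alpha = 1.0` corresponds to decoding $U^m_{dpc} + (1-\beta)S^m$, which is exactly $X^m$; this is the case used for perfect recovery.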



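Corollary 2: For a given power $P$, a combination of linear and DPC-based strategies achieves the
maximum rate $C(P)$ in the perfect recovery limit $\mathrm{MMSE}(P, R) = 0$, where $C(P)$ is given by

$$ C(P) = \sup_{\sigma_{SU} \in [-\sigma\sqrt{P}, 0]} \frac{1}{2}\log_2 \left( \frac{(P\sigma^2 - \sigma_{SU}^2)(1 + \sigma^2 + P + 2\sigma_{SU})}{\sigma^2(\sigma^2 + P + 2\sigma_{SU})} \right). \tag{29} $$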

Proof:
The achievability

The combination of linear and DPC-based strategies of [7] recovers $U^m_{dpc} + \alpha(1-\beta)S^m$ at the decoder
with high probability. In order to perfectly recover $X^m = (1-\beta)S^m + U^m_{dpc}$, we can use $\alpha = 1$, and
hence the strategy would achieve a rate of

$$ R_{ach} = \sup_{P_{lin}, P_{dpc} :\, P = P_{lin} + P_{dpc}} \frac{1}{2}\log_2 \left( \frac{P_{dpc}(P_{dpc} + \tilde{\sigma}^2 + 1)}{P_{dpc} + \tilde{\sigma}^2} \right), \tag{30} $$

where we take a supremum over $P_{lin}$, $P_{dpc}$ such that they sum up to $P$. Let $\sigma_{SU} = -\sigma\sqrt{P_{lin}}$ (note that
as $P_{lin}$ varies from 0 to $P$, $\sigma_{SU}$ varies from 0 to $-\sigma\sqrt{P}$). Then, $P_{dpc} = P - \frac{\sigma_{SU}^2}{\sigma^2}$, and $P_{dpc} + \tilde{\sigma}^2 =
P_{dpc} + \sigma^2 + P_{lin} - 2\sigma\sqrt{P_{lin}} = P + \sigma^2 + 2\sigma_{SU}$. Thus,

$$ R_{ach} = \sup_{\sigma_{SU} \in [-\sigma\sqrt{P}, 0]} \frac{1}{2}\log_2 \left( \frac{\left( P - \frac{\sigma_{SU}^2}{\sigma^2} \right)(P + \sigma^2 + 2\sigma_{SU} + 1)}{P + \sigma^2 + 2\sigma_{SU}} \right). \tag{31} $$

Simple algebra shows that this expression matches that in Corollary 2.
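The change of variables from the power split to the correlation $\sigma_{SU} = -\sigma\sqrt{P_{lin}}$ can be verified numerically. The following sketch, with illustrative function names and grid size, evaluates the rate in both parametrizations and approximates the supremum by a grid over $P_{lin} \in [0, P)$; the endpoint $P_{lin} = P$ is excluded because it leaves no power for DPC.

```python
import math

def rate_power_split(P, P_lin, sigma):
    """Rate with alpha = 1 as a function of the power split, as in (30)."""
    P_dpc = P - P_lin
    scaled_var = (sigma - math.sqrt(P_lin)) ** 2   # sigma^2 (1 - beta)^2
    return 0.5 * math.log2(P_dpc * (P_dpc + scaled_var + 1.0) / (P_dpc + scaled_var))

def rate_correlation(P, s_su, sigma):
    """Same rate as a function of sigma_SU = -sigma * sqrt(P_lin), as in (31)."""
    total = P + sigma**2 + 2.0 * s_su
    return 0.5 * math.log2((P - s_su**2 / sigma**2) * (total + 1.0) / total)

def achievable_rate(P, sigma, n_grid=2000):
    """Grid approximation of the supremum over the power split."""
    best = float("-inf")
    for i in range(n_grid):                        # i < n_grid keeps P_dpc > 0
        best = max(best, rate_power_split(P, P * i / n_grid, sigma))
    return best

if __name__ == "__main__":
    P, sigma, P_lin = 3.0, 1.5, 1.0
    print(rate_power_split(P, P_lin, sigma),
          rate_correlation(P, -sigma * math.sqrt(P_lin), sigma))
    print(achievable_rate(P, sigma))
```

The two parametrizations should agree pointwise, and the grid supremum is at least the pure-DPC value obtained at $P_{lin} = 0$.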



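The converse

Since we are free to choose $\gamma$ in Theorem 1, let $\gamma = \gamma^* = \frac{\sigma^2 + \sigma_{SU}}{\sigma^2 + P + 2\sigma_{SU}}$. Then, $1 - \gamma^* = \frac{P + \sigma_{SU}}{\sigma^2 + P + 2\sigma_{SU}}$.
Thus, we get

$$ \mathrm{MMSE}(P, R) \ge \inf_{\sigma_{SU}} \frac{1}{\gamma^{*2}} \left( \left( \sqrt{\frac{\sigma^2 2^{2R}}{1 + \sigma^2 + P + 2\sigma_{SU}}} - \sqrt{(1-\gamma^*)^2\sigma^2 + \gamma^{*2} P - 2\gamma^*(1-\gamma^*)\sigma_{SU}} \right)^+ \right)^2. \tag{32} $$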
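For perfect recovery ($\mathrm{MMSE}(P, R) = 0$), it has to be the case that the term inside $(\cdot)^+$ is non-positive for some value of $\sigma_{SU}$. This immediately
yields

$$
\begin{aligned}
2^{2R} &\le \sup_{\sigma_{SU}} \frac{1}{\sigma^2} \left( (1-\gamma^*)^2\sigma^2 + \gamma^{*2} P - 2\gamma^*(1-\gamma^*)\sigma_{SU} \right) \left( 1 + \sigma^2 + P + 2\sigma_{SU} \right) \\
&= \sup_{\sigma_{SU}} \frac{1}{\sigma^2} \, \frac{(P + \sigma_{SU})^2\sigma^2 + (\sigma^2 + \sigma_{SU})^2 P - 2(P + \sigma_{SU})(\sigma^2 + \sigma_{SU})\sigma_{SU}}{(\sigma^2 + P + 2\sigma_{SU})^2} \left( 1 + \sigma^2 + P + 2\sigma_{SU} \right) \\
&= \sup_{\sigma_{SU}} \frac{1}{\sigma^2} \, \frac{P^2\sigma^2 - \sigma_{SU}^2\sigma^2 + 2P\sigma_{SU}\sigma^2 + P\sigma^4 - P\sigma_{SU}^2 - 2\sigma_{SU}^3}{(\sigma^2 + P + 2\sigma_{SU})^2} \left( 1 + \sigma^2 + P + 2\sigma_{SU} \right) \\
&= \sup_{\sigma_{SU}} \frac{1}{\sigma^2} \, \frac{(P\sigma^2 - \sigma_{SU}^2)(P + \sigma^2 + 2\sigma_{SU})}{(\sigma^2 + P + 2\sigma_{SU})^2} \left( 1 + \sigma^2 + P + 2\sigma_{SU} \right) \\
&= \sup_{\sigma_{SU}} \frac{(P\sigma^2 - \sigma_{SU}^2)(1 + \sigma^2 + P + 2\sigma_{SU})}{\sigma^2(\sigma^2 + P + 2\sigma_{SU})}.
\end{aligned}
$$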
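The key algebraic step in the chain above is that the numerator obtained after substituting $\gamma^*$ factors as $(P\sigma^2 - \sigma_{SU}^2)(P + \sigma^2 + 2\sigma_{SU})$. The following sketch spot-checks this identity at random points; it is a numerical sanity check with hypothetical helper names, not a proof.

```python
import random

def expanded_numerator(P, s2, s_su):
    """(P + s)^2 s2 + (s2 + s)^2 P - 2 (P + s)(s2 + s) s, with s2 = sigma^2."""
    return (P + s_su) ** 2 * s2 + (s2 + s_su) ** 2 * P - 2 * (P + s_su) * (s2 + s_su) * s_su

def factored_numerator(P, s2, s_su):
    """(P s2 - s^2)(P + s2 + 2 s), the factored form used in the last two steps."""
    return (P * s2 - s_su**2) * (P + s2 + 2 * s_su)

def max_relative_gap(trials=1000, seed=0):
    """Largest relative mismatch between the two forms over random feasible points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        P = rng.uniform(0.1, 10.0)
        s2 = rng.uniform(0.1, 10.0)
        limit = (P * s2) ** 0.5                 # |sigma_SU| <= sigma * sqrt(P)
        s_su = rng.uniform(-limit, limit)
        a = expanded_numerator(P, s2, s_su)
        b = factored_numerator(P, s2, s_su)
        worst = max(worst, abs(a - b) / max(abs(a), abs(b), 1.0))
    return worst

if __name__ == "__main__":
    print(expanded_numerator(2, 3, -1), factored_numerator(2, 3, -1))
    print(max_relative_gap())
```

The integer point $(P, \sigma^2, \sigma_{SU}) = (2, 3, -1)$ can also be checked by hand: both forms evaluate to 15.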

Thus, we get the following upper bound on $C(P)$:

$$ C(P) \le \sup_{\sigma_{SU} \in [-\sigma\sqrt{P}, \sigma\sqrt{P}]} \frac{1}{2}\log_2 \left( \frac{(P\sigma^2 - \sigma_{SU}^2)(1 + \sigma^2 + P + 2\sigma_{SU})}{\sigma^2(\sigma^2 + P + 2\sigma_{SU})} \right). \tag{33} $$

The term $(P\sigma^2 - \sigma_{SU}^2)$ is oblivious to the sign of $\sigma_{SU}$. However, the term

$$ \frac{1 + \sigma^2 + P + 2\sigma_{SU}}{\sigma^2 + P + 2\sigma_{SU}} = 1 + \frac{1}{\sigma^2 + P + 2\sigma_{SU}} \tag{34} $$

is clearly larger for $\sigma_{SU} < 0$ if we fix $|\sigma_{SU}|$. Thus the supremum in (33) is attained at some $\sigma_{SU} \le 0$,
and we get

$$ C(P) \le \sup_{\sigma_{SU} \in [-\sigma\sqrt{P}, 0]} \frac{1}{2}\log_2 \left( \frac{(P\sigma^2 - \sigma_{SU}^2)(1 + \sigma^2 + P + 2\sigma_{SU})}{\sigma^2(\sigma^2 + P + 2\sigma_{SU})} \right), \tag{35} $$

which matches the expression in (31). Thus for perfect reconstruction (MMSE = 0), the combination of
linear and DPC strategies proposed in [7] is optimal. ■



IV. Numerical results 

Witsenhausen's original control theoretic formulation seeks to minimize the sum of weighted costs 
$k^2 P + \mathrm{MMSE}$. Fig. 2(b) shows that asymptotically, the ratio of upper and new lower bounds (from
Corollary 1) on the weighted cost is bounded by 1.3, an improvement over the ratio of 2 in [7]. The
ridge of ratio 2 along $\sigma^2 = $ present in Fig. 2(a) (obtained using the old bound from [7]) does not
exist with the new lower bound since this small-$k$ regime corresponds to target MMSEs close to zero,
where the new lower bound is tight. This is illustrated in Fig. 3 (top). Also shown in Fig. 3 (bottom) is
the lack of tightness in the bounds at small $P$. The figure explains how this looseness results in the ridge
along $k \approx 1.67$ still surviving in the new ratio plot.
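A lower bound on the weighted cost follows from a lower bound on MMSE by minimizing $k^2 P + \mathrm{MMSE}$ over the power $P$. The sketch below is one hedged way to approximate this at $R = 0$; it is not the MATLAB code of [15], and the function names and grid sizes are assumptions. The search is restricted to $P \in [0, \sigma^2]$ because the MMSE bound is already zero at $P = \sigma^2$, so larger powers only increase the cost. Since the minimization uses grids, the result approximates the bound rather than certifying it.

```python
import math

def mmse_lower_bound(P, sigma, n_su=41, n_gamma=200, gamma_max=4.0):
    """Coarse grid-search approximation of the zero-rate inf-sup bound on MMSE."""
    s2 = sigma**2
    limit = sigma * math.sqrt(P)
    outer = float("inf")
    for i in range(n_su):
        s_su = -limit + 2.0 * limit * i / (n_su - 1)
        first = math.sqrt(s2 / (1.0 + s2 + P + 2.0 * s_su))
        inner = 0.0
        for j in range(1, n_gamma + 1):
            g = gamma_max * j / n_gamma
            second_sq = (1 - g) ** 2 * s2 + g * g * P - 2 * g * (1 - g) * s_su
            second = math.sqrt(max(second_sq, 0.0))
            inner = max(inner, (max(first - second, 0.0) / g) ** 2)
        outer = min(outer, inner)
    return outer

def cost_lower_bound(k, sigma, n_P=60):
    """Approximate min over P in [0, sigma^2] of k^2 P + (lower bound on MMSE)."""
    best = float("inf")
    for i in range(n_P + 1):
        P = sigma**2 * i / n_P
        best = min(best, k * k * P + mmse_lower_bound(P, sigma))
    return best

if __name__ == "__main__":
    print(cost_lower_bound(k=0.3, sigma=1.0))
```

Two simple strategies give reference points: zero input costs $\sigma^2/(\sigma^2+1)$, and cancelling the host with $P = \sigma^2$ costs $k^2\sigma^2$, so the computed value should not exceed the smaller of the two.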

Fig. 4 shows the ratio of upper and lower bounds on $\mathrm{MMSE}(P, 0)$ versus $P$ and $\sigma$. While the ratio
with the bound of [7] was unbounded (Fig. 4, top), the new ratio is bounded by a factor of 1.5 (Fig. 4,
bottom). This is again a reflection of the tightness of the bound at small MMSE. A flipped perspective
is shown in Fig. 5, where we compute the ratio of upper and lower bounds on the required power to attain a
specified MMSE. As further evidence of the lack of tightness in the small-$P$ ("high distortion") regime,
the ratio of upper and lower bounds on required power diverges to infinity along the path $\mathrm{MMSE} = \frac{\sigma^2}{\sigma^2 + 1}$.
Fig. 6 shows the upper and the lower bounds for $R = 0.5$. Again, the bounds are not tight in the
small-$P$ regime; now the looseness is at the lowest power $P = 1$ at which communication at $R = 0.5$
is possible. As shown in Corollary 2, the bounds are still tight at $\mathrm{MMSE} = 0$. Fig. 7 shows the upper
and lower bounds on MMSE as a function of the rate $R$ for fixed power $P = 1$ and $\sigma^2$ equal to the
Golden ratio. The figure demonstrates that beyond the maximum rate with zero distortion, the price of
increasing rate is an increased distortion in the estimation of $X^m$.



The MATLAB code for these figures can be found in [15]. 
consequence of the tightness of the lower bound at MMSE = 0, and hence for small $k$. A ridge remains along $k \approx 1.67$ ($\log_{10}(k) \approx 0.22$) and
large $\sigma$, and this can be understood by observing Fig. 3 for $\sigma = 10$.




Fig. 3. Upper and lower bounds on asymptotic MMSE vs $P$ for $\sigma$ equal to the square root of the Golden ratio (Fig. (a)) and $\sigma = 10$ (Fig. (b))
for zero rate (the vector Witsenhausen counterexample). Tangents are drawn to evaluate the total cost for $k = \sqrt{0.1}$ for the first value of $\sigma$, and
for $k = 1.67$ for $\sigma = 10$ (slope $= -k^2$). The intercept on the MMSE axis of the tangent provides the respective bound on the total cost.
The tangents to the upper bound and the new lower bound almost coincide for small values of $k$. At $k \approx 1.67$ and $\sigma = 10$, however, our
bound is not significantly better than that in [7], and hence the ridge along $k \approx 1.67$ remains in the new ratio plot in Fig. 2.




Fig. 4. Ratio of upper and lower bounds on MMSE vs $P$ and $\sigma$ at $R = 0$. Whereas the ratio diverges to infinity with the old lower
bound of [7] (top), it is bounded by 1.5 for the new bound (bottom). This is a consequence of the improved tightness of the new bound at
small MMSE.




Fig. 5. Ratio of upper and lower bounds on $P$ vs MMSE and $\sigma$ at $R = 0$. Interestingly, the ratio increases to infinity as $\sigma \to \infty$ along
the path where $P$ is close to zero (corresponding to "high" $\mathrm{MMSE} = \frac{\sigma^2}{\sigma^2 + 1}$).
Fig. 6. Upper and lower bounds on $P$ vs MMSE for a fixed $\sigma$ and $R = 0.5$. Though the bounds match at MMSE = 0 (by
Corollary 2), the bounds do not match at the minimum power ($P = 1$ here) for nonzero rates. Below $P = 1$, communication at $R = 0.5$ is
not possible.
Fig. 7. Plot of upper and lower bounds on MMSE vs rate for fixed power $P = 1$ and $\sigma$ equal to the square root of the Golden ratio. Higher rates require higher average
distortion in the reconstruction of $X^m$.



